The Mythical Man-Month Notes

The Tar Pit

The Programming Systems Product

Moving down across the horizontal boundary, a program becomes a programming product. This is a program that can be run, tested, repaired, and extended by anybody. It is usable in many operating environments, for many sets of data. To become a generally usable programming product, a program must be written in a generalized fashion. In particular the range and form of inputs must be generalized as much as the basic algorithm will reasonably allow. Then the program must be thoroughly tested, so that it can be depended upon. This means that a substantial bank of test cases, exploring the input range and probing its boundaries, must be prepared, run, and recorded. Finally, promotion of a program to a programming product requires its thorough documentation, so that anyone may use it, fix it, and extend it. As a rule of thumb, I estimate that a programming product costs at least three times as much as a debugged program with the same function.

Moving across the vertical boundary, a program becomes a component in a programming system. This is a collection of interacting programs, coordinated in function and disciplined in format, so that the assemblage constitutes an entire facility for large tasks. To become a programming system component, a program must be written so that every input and output conforms in syntax and semantics with precisely defined interfaces. The program must also be designed so that it uses only a prescribed budget of resources— memory space, input-output devices, computer time. Finally, the program must be tested with other system components, in all expected combinations. This testing must be extensive, for the number of cases grows combinatorially. It is time-consuming, for subtle bugs arise from unexpected interactions of debugged components. A programming system component costs at least three times as much as a stand-alone program of the same function. The cost may be greater if the system has many components.

In the lower right-hand corner of Fig. 1.1 stands the programming systems product. This differs from the simple program in all of the above ways. It costs nine times as much. But it is the truly useful object, the intended product of most system programming efforts.

The Joys of the Craft

First is the sheer joy of making things

Second is the pleasure of making things that are useful to other people.

Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning.

Fourth is the joy of always learning, which springs from the nonrepeating nature of the task.

Finally, there is the delight of working in such a tractable medium.

The Woes of the Craft

First, one must perform perfectly. The computer resembles the magic of legend in this respect, too. If one character, one pause, of the incantation is not strictly in proper form, the magic doesn't work.

Next, other people set one's objectives, provide one's resources, and furnish one's information. One rarely controls the circumstances of his work, or even its goal.

The next woe is that designing grand concepts is fun; finding nitty little bugs is just work. With any creative activity come dreary hours of tedious, painstaking labor, and programming is no exception.

Next, one finds that debugging has a linear convergence, or worse, where one somehow expects a quadratic sort of approach to the end. So testing drags on and on, the last difficult bugs taking more time to find than the first.

The last woe, and sometimes the last straw, is that the product over which one has labored so long appears to be obsolete upon (or before) completion. Already colleagues and competitors are in hot pursuit of new and better ideas. Already the displacement of one's thought-child is not only conceived, but scheduled.

The Mythical Man-Month

More software projects have gone awry for lack of calendar time than for all other causes combined. Why is this cause of disaster so common?

First, our techniques of estimating are poorly developed.

Second, our estimating techniques fallaciously confuse effort with progress, hiding the assumption that men and months are interchangeable.

Third, because we are uncertain of our estimates, software managers often lack the courteous stubbornness of Antoine's chef.

Fourth, schedule progress is poorly monitored.

Fifth, when schedule slippage is recognized, the natural (and traditional) response is to add manpower.

For some years I have been successfully using the following rule of thumb for scheduling a software task:

1/3 planning
1/6 coding
1/4 component test and early system test
1/4 system test, all components in hand.

This differs from conventional scheduling in several important ways:

  1. The fraction devoted to planning is larger than normal. Even so, it is barely enough to produce a detailed and solid specification, and not enough to include research or exploration of totally new techniques.
  2. The half of the schedule devoted to debugging of completed code is much larger than normal.
  3. The part that is easy to estimate, i.e., coding, is given only one-sixth of the schedule.
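
To make the arithmetic concrete, here is a minimal sketch (the 48-week total and the function name are my own, purely illustrative) that applies the rule of thumb above to a whole schedule.

    # Sketch: splitting a total schedule by the rule of thumb above.
    # The 48-week figure is an invented example, not from the book.
    def split_schedule(total_weeks):
        """Return the rule-of-thumb allocation for a software task."""
        return {
            "planning": total_weeks / 3,
            "coding": total_weeks / 6,
            "component test and early system test": total_weeks / 4,
            "system test, all components in hand": total_weeks / 4,
        }

    for phase, weeks in split_schedule(48).items():
        print(f"{phase}: {weeks:.1f} weeks")
    # planning: 16.0, coding: 8.0, each test phase: 12.0 weeks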

The Surgical Team

Mills's Proposal

Mills proposes that each segment of a large job be tackled by a team, but that the team be organized like a surgical team rather than a hog-butchering team. That is, instead of each member cutting away on the problem, one does the cutting and the others give him every support that will enhance his effectiveness and productivity.

The surgeon. Mills calls him a chief programmer. He personally defines the functional and performance specifications, designs the program, codes it, tests it, and writes its documentation.

The copilot. He is the alter ego of the surgeon, able to do any part of the job, but is less experienced. His main function is to share in the design as a thinker, discussant, and evaluator. The surgeon tries ideas on him, but is not bound by his advice. The copilot often represents his team in discussions of function and interface with other teams. He knows all the code intimately. He researches alternative design strategies. He obviously serves as insurance against disaster to the surgeon. He may even write code, but he is not responsible for any part of the code.

The administrator. The surgeon is boss, and he must have the last word on personnel, raises, space, and so on, but he must spend almost none of his time on these matters. Thus he needs a professional administrator who handles money, people, space, and machines, and who interfaces with the administrative machinery of the rest of the organization. Baker suggests that the administrator has a full-time job only if the project has substantial legal, contractual, reporting, or financial requirements because of the user-producer relationship. Otherwise, one administrator can serve two teams.

The editor. The surgeon is responsible for generating the documentation— for maximum clarity he must write it. This is true of both external and internal descriptions. The editor, however, takes the draft or dictated manuscript produced by the surgeon and criticizes it, reworks it, provides it with references and bibliography, nurses it through several versions, and oversees the mechanics of production.

Two secretaries. The administrator and the editor will each need a secretary; the administrator's secretary will handle project correspondence and non-product files.

The program clerk. He is responsible for maintaining all the technical records of the team in a programming-product library. The clerk is trained as a secretary and has responsibility for both machine-readable and human-readable files.

The toolsmith. File-editing, text-editing, and interactive debugging services are now readily available, so that a team will rarely need its own machine and machine-operating crew. But these services must be available with unquestionably satisfactory response and reliability; and the surgeon must be sole judge of the adequacy of the service available to him. He needs a toolsmith, responsible for ensuring this adequacy of the basic service and for constructing, maintaining, and upgrading special tools—mostly interactive computer services—needed by his team. Each team will need its own toolsmith, regardless of the excellence and reliability of any centrally provided service, for his job is to see to the tools needed or wanted by his surgeon, without regard to any other team's needs. The tool-builder will often construct specialized utilities, catalogued procedures, macro libraries.

The tester. The surgeon will need a bank of suitable test cases for testing pieces of his work as he writes it, and then for testing the whole thing.

The language lawyer. By the time Algol came along, people began to recognize that most computer installations have one or two people who delight in mastery of the intricacies of a programming language.

How It Works

Notice in particular the differences between a team of two programmers conventionally organized and the surgeon-copilot team. First, in the conventional team the partners divide the work, and each is responsible for design and implementation of part of the work. In the surgical team, the surgeon and copilot are each cognizant of all of the design and all of the code. This saves the labor of allocating space, disk accesses, etc. It also ensures the conceptual integrity of the work.

In the surgical team, there are no differences of interest, and differences of judgment are settled by the surgeon unilaterally. These two differences—lack of division of the problem and the superior-subordinate relationship—make it possible for the surgical team to act uno animo.

Aristocracy, Democracy, and System Design

Conceptual Integrity

I will contend that conceptual integrity is the most important consideration in system design. It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas.

Achieving Conceptual Integrity

Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design.

Simplicity and straightforwardness proceed from conceptual integrity. Every part must reflect the same philosophies and the same balancing of desiderata. Every part must even use the same techniques in syntax and analogous notions in semantics. Ease of use, then, dictates unity of design, conceptual integrity.

Aristocracy and Democracy

Conceptual integrity in turn dictates that the design must proceed from one mind, or from a very small number of agreeing resonant minds.

Schedule pressures, however, dictate that system building needs many hands. Two techniques are available for resolving this dilemma. The first is a careful division of labor between architecture and implementation. The second is the new way of structuring programming implementation teams discussed in the previous chapter.

The separation of architectural effort from implementation is a very powerful way of getting conceptual integrity on very large projects.

By the architecture of a system, I mean the complete and detailed specification of the user interface. For a computer this is the programming manual. For a compiler it is the language manual. For a control program it is the manuals for the language or languages used to invoke its functions. For the entire system it is the union of the manuals the user must consult to do his entire job.

Architecture must be carefully distinguished from implementation. As Blaauw has said, "Where architecture tells what happens, implementation tells how it is made to happen."

What Does the Implementer Do While Waiting?

The architecture manager had 10 good men. He asserted that they could write the specifications and do it right.

The control program manager had 150 men. He asserted that they could prepare the specifications, with the architecture team coordinating; it would be well-done and practical, and he could do it on schedule.

As Blaauw points out, the total creative effort involves three distinct phases: architecture, implementation, and realization. It turns out that these can in fact be begun in parallel and proceed simultaneously.

In computer design, for example, the implementer can start as soon as he has relatively vague assumptions about the manual, somewhat clearer ideas about the technology, and well-defined cost and performance objectives. He can begin designing data flows, control sequences, gross packaging concepts, and so on. He devises or adapts the tools he will need, especially the record-keeping system, including the design automation system.

The Second-System Effect

Interactive Discipline for the Architect

The architect has two possible answers when confronted with an estimate that is too high: cut the design or challenge the estimate by suggesting cheaper implementations. This latter is inherently an emotion-generating activity. The architect is now challenging the builder's way of doing the builder's job. For it to be successful, the architect must

  • remember that the builder has the inventive and creative responsibility for the implementation; so the architect suggests, not dictates;
  • always be prepared to suggest a way of implementing anything he specifies, and be prepared to accept any other way that meets the objectives as well;
  • deal quietly and privately in such suggestions;
  • be ready to forego credit for suggested improvements.

Self-Discipline—The Second-System Effect

How does the architect avoid the second-system effect? Well, obviously he can't skip his second system. But he can be conscious of the peculiar hazards of that system, and exert extra self-discipline to avoid functional ornamentation and to avoid extrapolation of functions that are obviated by changes in assumptions and purposes.

A discipline that will open an architect's eyes is to assign each little function a value: capability x is worth not more than m bytes of memory and n microseconds per invocation. These values will guide initial decisions and serve during implementation as a guide and warning to all.
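
A minimal sketch of this discipline, with capability names and numbers invented for the illustration: each function carries an explicit ceiling in bytes and microseconds, and any proposed implementation is checked against it.

    # Per-capability value budgets; all names and figures are hypothetical.
    BUDGETS = {
        # capability: (max bytes, max microseconds per invocation)
        "date_conversion": (400, 50),
        "keyword_lookup": (1200, 20),
    }

    def within_budget(capability, measured_bytes, measured_us):
        """Flag an implementation that exceeds the value assigned to its function."""
        max_bytes, max_us = BUDGETS[capability]
        if measured_bytes > max_bytes or measured_us > max_us:
            print(f"{capability}: {measured_bytes} B / {measured_us} us "
                  f"exceeds budget {max_bytes} B / {max_us} us")
            return False
        return True

    within_budget("date_conversion", 520, 45)   # flags the size overrun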

Passing the Word

Written Specifications—the Manual

I think the finest piece of manual writing I have ever seen is Blaauw's Appendix to System/360 Principles of Operation. This describes with care and precision the limits of System/360 compatibility. It defines compatibility, prescribes what is to be achieved, and enumerates those areas of external appearance where the architecture is intentionally silent and where results from one model may differ from those of another, where one copy of a given model may differ from another copy, or where a copy may differ even from itself after an engineering change. This is the level of precision to which manual writers aspire, and they must define what is not prescribed as carefully as what is.

Formal Definitions

English, or any other human language, is not naturally a precision instrument for such definitions. Therefore the manual writer must strain himself and his language to achieve the precision needed. An attractive alternative is to use a formal notation for such definitions.

An ancient adage warns, "Never go to sea with two chronometers; take one or three." The same thing clearly applies to prose and formal definitions. If one has both, one must be the standard, and the other must be a derivative description, clearly labeled as such. Either can be the primary standard. Algol 68 has a formal definition as standard and a prose definition as descriptive. PL/I has the prose as standard and the formal description as derivative.

Many tools are available for formal definition. The Backus-Naur Form is familiar for language definition, and it is amply discussed in the literature.[1] The formal description of PL/I uses new notions of abstract syntax, and it is adequately described.[2] Iverson's APL has been used to describe machines, most notably the IBM 7090[3] and System/360.[4]

Bell and Newell have proposed new notations for describing both configurations and machine architectures, and they have illustrated these with several machines, including the DEC PDP-8,[5] the 7090, and System/360.[6]

An implementation can itself serve as a definition: it runs, so all questions of definition can be resolved by testing it.

Using an implementation as a definition has some advantages. All questions can be settled unambiguously by experiment. Debate is never needed, so answers are quick. Answers are always as precise as one wants, and they are always correct, by definition.

The implementation as a definition overprescribed; it not only said what the machine must do, it also said a great deal about how it had to do it.

Then, too, the implementation will sometimes give unexpected and unplanned answers when sharp questions are asked, and the de facto definition will often be found to be inelegant in these particulars precisely because they have never received any thought.

Finally, the use of an implementation as a formal definition is peculiarly susceptible to confusion as to whether the prose description or the formal description is in fact the standard. This is especially true of programmed simulations.

Direct Incorporation

This technique is to design the declaration of the passed parameters or shared storage, and to require the implementations to include that declaration via a compile-time operation (a macro or a %INCLUDE in PL/I). It is especially useful for establishing the syntax, if not the semantics, of intermodule interfaces.
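
The same idea carries over directly to modern languages: the interface declaration lives in one shared file that every implementer includes, rather than being restated in each module. A hypothetical Python sketch (module and field names are invented):

    # shared_interface.py -- the single authoritative declaration of the
    # passed-parameter record, playing the role of a PL/I %INCLUDE member.
    from dataclasses import dataclass

    @dataclass
    class TransferRecord:
        account_id: int
        amount_cents: int
        currency: str

    # Both the calling and the called module import this declaration
    # (from shared_interface import TransferRecord) instead of redeclaring
    # it, so their interfaces cannot drift apart in syntax.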

Conferences and Courts

A new problem is usually discussed a while. The emphasis is on creativity, rather than merely decision. The group attempts to invent many solutions to problems, then a few solutions are passed to one or more of the architects for detailing into precisely worded manual change proposals.

The fruitfulness of these meetings springs from several sources:

  1. The same group—architects, users, and implementers—meets weekly for months. No time is needed for bringing people up to date.
  2. The group is bright, resourceful, well versed in the issues, and deeply involved in the outcome. No one has an "advisory" role. Everyone is authorized to make binding commitments.
  3. When problems are raised, solutions are sought both within and outside the obvious boundaries.
  4. The formality of written proposals focuses attention, forces decision, and avoids committee-drafted inconsistencies.
  5. The clear vesting of decision-making power in the chief architect avoids compromise and delay.

The Telephone Log

One useful mechanism is a telephone log kept by the architect. In it he records every question and every answer. Each week the logs of the several architects are concatenated, reproduced, and distributed to the users and implementers. While this mechanism is quite informal, it is both quick and comprehensive.

Product Test

Time after time, the careful product tester will find places where the word didn't get passed, where the design decisions were not properly understood or accurately implemented. For this reason such a testing group is a necessary link in the chain by which the design word is passed, a link that needs to operate early and simultaneously with design.

Why Did the Tower of Babel Fail?

A Management Audit of the Babel Project

In two respects—communication, and its consequent, organization. They were unable to talk with each other; hence they could not coordinate. When coordination failed, work ground to a halt. Reading between the lines we gather that lack of communication led to disputes, bad feelings, and group jealousies. Shortly the clans began to move apart, preferring isolation to wrangling.

Communication in the Large Programming Project

How, then, shall teams communicate with one another? In as many ways as possible.

  • Informally. Good telephone service and a clear definition of intergroup dependencies will encourage the hundreds of calls upon which common interpretation of written documents depends.
  • Meetings. Regular project meetings, with one team after another giving technical briefings, are invaluable. Hundreds of minor misunderstandings get smoked out this way.
  • Workbook. A formal project workbook must be started at the beginning. This deserves a section by itself.

The Project Workbook

What. All the documents of the project need to be part of this structure. This includes objectives, external specifications, interface specifications, technical standards, internal specifications, and administrative memoranda.

Why. Technical prose is almost immortal. If one examines the genealogy of a customer manual for a piece of hardware or software, one can trace not only the ideas, but also many of the very sentences and paragraphs back to the first memoranda proposing the product or explaining the first design.

It is very important to get the structure of the documentation right. The early design of the project workbook ensures that the documentation structure itself is crafted, not haphazard. Moreover, the establishment of a structure molds later writing into segments that fit into that structure.

The second reason for the project workbook is control of the distribution of information.

Mechanics. First, one must mark changed text on the page, e.g., by a vertical bar in the margin alongside every altered line. Second, one needs to distribute with the new pages a short, separately written change summary that lists the changes and remarks on their significance.

On balance I think the microfiche was a very happy mechanism, and I would recommend it over a paper workbook for very large projects.

How would one do it today? With today's system technology available, I think the technique of choice is to keep the workbook on the direct-access file, marked with change bars and revision dates. Each user would consult it from a display terminal.

Organization in the Large Programming Project

If there are n workers on a project, there are n(n-1)/2 interfaces across which there may be communication, and there are potentially almost 2^n teams within which coordination must occur. The purpose of organization is to reduce the amount of communication and coordination necessary; hence organization is a radical attack on the communication problems treated above.
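
A quick check of how fast these counts grow (a minimal sketch; the head counts are arbitrary):

    # Pairwise communication interfaces n(n-1)/2 and potential coordinating
    # subgroups (subsets of two or more people), which approach 2**n.
    for n in (5, 10, 50, 150):
        interfaces = n * (n - 1) // 2
        subgroups = 2 ** n - n - 1   # subsets of size >= 2
        print(f"n={n:3d}: {interfaces:6d} interfaces, "
              f"~{float(subgroups):.2e} potential teams")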

The means by which communication is obviated are division of labor and specialization of function. The tree-like structure of organizations reflects the diminishing need for detailed communication when division and specialization of labor are applied.

Let us consider a tree-like programming organization, and examine the essentials which any subtree must have in order to be effective. They are:

  1. a mission
  2. a producer
  3. a technical director or architect
  4. a schedule
  5. a division of labor
  6. interface definitions among the parts

What is the role of the producer? He assembles the team, divides the work, and establishes the schedule. He acquires and keeps on acquiring the necessary resources. This means that a major part of his role is communication outside the team, upwards and sideways. He establishes the pattern of communication and reporting within the team. Finally, he ensures that the schedule is met, shifting resources and organization to respond to changing circumstances.

How about the technical director? He conceives of the design to be built, identifies its subparts, specifies how it will look from outside, and sketches its internal structure. He provides unity and conceptual integrity to the whole design; thus he serves as a limit on system complexity. As individual technical problems arise, he invents solutions for them or shifts the system design as required.

The producer and the technical director may be the same man. This is readily workable on very small teams, perhaps three to six programmers. On larger projects it is very rarely workable, for two reasons. First, the man with strong management talent and strong technical talent is rarely found. Thinkers are rare; doers are rarer; and thinker-doers are rarest.

The producer may be boss, the director his right-hand man.

The director may be boss, and the producer his right-hand man.

Calling the Shot

Practice is the best of all instructors.
                             PUBLILIUS SYRUS
Experience is a dear teacher, but fools will learn at no other.
                         POOR RICHARD'S ALMANAC

How long will a system programming job take? How much effort will be required? How does one estimate?

First, one must say that one does not estimate the entire task by estimating the coding portion only and then applying the ratios. The coding is only one-sixth or so of the problem, and errors in its estimate or in the ratios could lead to ridiculous results.

Second, one must say that data for building isolated small programs are not applicable to programming systems products. Planning, documentation, testing, system integration, and training times must be added. The linear extrapolation of such sprint figures is meaningless. Extrapolation of times for the hundred-yard dash shows that a man can run a mile in under three minutes.

Before dismissing them, however, let us note that these numbers, although not for strictly comparable problems, suggest that effort goes as a power of size even when no communication is involved except that of a man with his memories.

Results reported from a study done by Nanus and Farr[7] at System Development Corporation show an exponent of 1.5; that is,

effort = (constant) × (number of instructions)^1.5.

Another SDC study reported by Weinwurm also shows an exponent near 1.5.

Morin has prepared a survey of the published data.[8]
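
A worked consequence of an exponent near 1.5 (a minimal sketch; the program sizes are arbitrary): doubling the number of instructions multiplies the effort by about 2^1.5 ≈ 2.8, which is why linear extrapolation from small programs is misleading.

    # effort = constant * instructions**1.5; the constant cancels in ratios.
    def effort_ratio(size_small, size_large, exponent=1.5):
        return (size_large / size_small) ** exponent

    print(effort_ratio(10_000, 20_000))    # ~2.83: twice the code, ~2.8x the effort
    print(effort_ratio(10_000, 100_000))   # ~31.6: ten times the code, ~32x the effort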

Ten Pounds in a Five-Pound Sack

Program Space as Cost

Since size is such a large part of the user cost of a programming system product, the builder must set size targets, control size, and devise size-reduction techniques, just as the hardware builder sets component-count targets, controls component count, and devises count-reduction techniques. Like any cost, size itself is not bad, but unnecessary size is.

Size Control

First, setting size targets for core is not enough; one has to budget all aspects of size.

So the second moral is also clear: Define exactly what a module must do when you specify how big it must be.

A third and deeper lesson shows through these experiences. All during implementation, the system architects must maintain continual vigilance to ensure continued system integrity. Beyond this policing mechanism, however, lies the matter of attitude of the implementers themselves. Fostering a total-system, user-oriented attitude may well be the most important function of the programming manager.

Space Techniques

Obviously, more function means more space, speed being held constant. So the first area of craftsmanship is in trading function for size. One can design a program with many optional features, each of which takes a little space. One can design a generator that will take an option list and tailor a program to it. So the designer must decide how fine-grained the user choice of options will be.

The second area of craftsmanship is space-time trade-offs. For a given function, the more space, the faster. This is true over an amazingly large range. It is this fact that makes it feasible to set space budgets.

The manager can do two things to help his team make good space-time trade-offs. One is to ensure that they are trained in programming technique, not just left to rely on native wit and previous experience.

The second is to recognize that programming has a technology, and components need to be fabricated. Every project needs a notebook full of good subroutines or macros for queuing, searching, hashing, and sorting. For each such function the notebook should have at least two programs, the quick and the squeezed.
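
As a minimal sketch of such a quick and squeezed pair (bit counting is my own illustrative choice, not an example from the book): the quick version spends a 256-entry table on speed, while the squeezed version spends a loop iteration per set bit instead.

    # Quick: trades 256 table entries of space for speed.
    _POPCOUNT_TABLE = [bin(i).count("1") for i in range(256)]

    def popcount_quick(x):
        count = 0
        while x:
            count += _POPCOUNT_TABLE[x & 0xFF]
            x >>= 8
        return count

    # Squeezed: no table; one loop iteration per set bit.
    def popcount_squeezed(x):
        count = 0
        while x:
            x &= x - 1          # clear the lowest set bit
            count += 1
        return count

    assert popcount_quick(0b10110110) == popcount_squeezed(0b10110110) == 5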

Representation Is the Essence of Programming

The programmer at wit's end for lack of space can often do best by disentangling himself from his code, rearing back, and contemplating his data. Representation is the essence of programming.

The Documentary Hypothesis

Documents for a Computer Product

Suppose one is building a machine. What are the critical documents?

  • Objectives. This defines the need to be met and the goals, desiderata, constraints, and priorities.
  • Specifications. This is a computer manual plus performance specifications. It is one of the first documents generated in proposing a new product, and the last document finished.
  • Schedule
  • Budget. Not merely a constraint, the budget is one of the manager's most useful documents. Existence of the budget forces technical decisions that otherwise would be avoided; and, more important, it forces and clarifies policy decisions.
  • Organization chart
  • Space allocations
  • Estimate, forecast, prices. These three have cyclic interlocking, which determines the success or failure of the project.

Documents for a University Department

Objectives
Course descriptions
Degree requirements
Research proposals (hence plans, when funded)
Class schedule and teaching assignments
Budget
Space allocation
Assignment of staff and graduate students

Documents for a Software Project

  • What: objectives. This defines the need to be met and the goals, desiderata, constraints, and priorities.
  • What: product specifications. This begins as a proposal and ends up as the manual and internal documentation. Speed and space specifications are a critical part.
  • When: schedule
  • How much: budget
  • Where: space allocation
  • Who: organization chart.

Why Have Formal Documents?

First, writing the decisions down is essential. The act of writing turns out to require hundreds of mini-decisions, and it is the existence of these that distinguishes clear, exact policies from fuzzy ones.

Second, the documents will communicate the decisions to others.

Finally, a manager's documents give him a data base and checklist.

Plan to Throw One Away

Pilot Plants and Scaling Up

Hence plan to throw one away; you will, anyhow.

The Only Constancy Is Change Itself

Nevertheless, some changes in objectives are inevitable, and it is better to be prepared for them than to assume that they won't come. Not only are changes in objective inevitable, changes in development strategy and technique are also inevitable. The throw-one-away concept is itself just an acceptance of the fact that as one learns, he changes the design.[9]

Plan the System for Change

Quantization of change is an essential technique. Every product should have numbered versions, and each version must have its own schedule and a freeze date, after which changes go into the next version.

Plan the Organization for Change

Cosgrove advocates treating all plans, milestones, and schedules as tentative, so as to facilitate change. This goes much too far—the common failing of programming groups today is too little management control, not too much.

Structuring an organization for change is much harder than designing a system for change. Each man must be assigned to jobs that broaden him, so that the whole force is technically flexible.

One Step Forward and One Step Back

They find that the total number of modules increases linearly with release number, but that the number of modules affected increases exponentially with release number. All repairs tend to destroy the structure, to increase the entropy and disorder of the system. Less and less effort is spent on fixing original design flaws; more and more is spent on fixing flaws introduced by earlier fixes. As time passes, the system becomes less and less well-ordered. Sooner or later the fixing ceases to gain any ground. Each forward step is matched by a backward one.

Sharp Tools

Both specialized needs and personal preferences dictate the need for specialized tools as well; so in discussing programming teams I have postulated one toolmaker per team. This man masters all the common tools and is able to instruct his client-boss in their use. He also builds the specialized tools his boss needs.

What are the tools about which the manager must philosophize, plan, and organize? First, a computer facility. This requires machines, and a scheduling philosophy must be adopted. It requires an operating system, and service philosophies must be established. It requires language, and a language policy must be laid down. Then there are utilities, debugging aids, test-case generators, and a text-processing system to handle documentation. Let us look at these one by one.[10]

Target Machines

Machine support is usefully divided into the target machine and the vehicle machines.

Scheduling.

System debugging has always been a graveyard-shift occupation, like astronomy.

Vehicle Machines and Data Services

Simulators. If the target computer is new, one needs a logical simulator for it. This gives a debugging vehicle long before the real target exists. Equally important, it gives access to a dependable debugging vehicle even after one has a target machine available.

So a dependable simulator on a well-aged vehicle retains its usefulness far longer than one would expect.

Compiler and assembler vehicles.

For the same reasons, one wants compilers and assemblers that run on dependable vehicles but compile object code for the target system. This can then start being debugged on the simulator.

Program libraries and accounting.

First, each group or programmer had an area where he kept copies of his programs, his test cases, and the scaffolding he needed for component testing. In this playpen area there were no restrictions on what a man could do with his own programs; they were his.

Two notions are important here. The first is control, the idea of program copies belonging to managers who alone can authorize their change. The second is that of formal separation and progression from the playpen, to integration, to release.

Program tools.

Documentation system.

Among all tools, the one that saves the most labor may well be a computerized text-editing system, operating on a dependable vehicle.

Performance simulator.

Better have one. Build it outside-in. Start it very early. Listen to it when it speaks.

High-Level Language and Interactive Programming

  • High-level language. The chief reasons for using a high-level language are productivity and debugging speed.
  • Interactive programming. There is a widespread recognition that debugging is the hard and slow part of system programming, and slow turnaround is the bane of debugging. So the logic of interactive programming seems inexorable.

The Whole and the Parts

Designing the Bugs Out

  • Bug-proofing the definition.

    The most pernicious and subtle bugs are system bugs arising from mismatched assumptions made by the authors of various components. Careful function definition, careful specification, and the disciplined exorcism of frills of function and flights of technique all reduce the number of system bugs that have to be found.

  • Testing the specification.

    Long before any code exists, the specification must be handed to an outside testing group to be scrutinized for completeness and clarity.

  • Top-down design.

    Niklaus Wirth formalized a design procedure which had been used for years by the best programmers.[11] Briefly, Wirth's procedure is to identify design as a sequence of refinement steps.

    From this process one identifies modules of solution or of data whose further refinement can proceed independently of other work. The degree of this modularity determines the adaptability and changeability of the program.

    A good top-down design avoids bugs in several ways. First, the clarity of structure and representation makes the precise statement of requirements and functions of the modules easier. Second, the partitioning and independence of modules avoids system bugs. Third, the suppression of detail makes flaws in the structure more apparent. Fourth, the design can be tested at each of its refinement steps, so testing can start earlier and focus on the proper level of detail at each step.

  • Structured programming.

    Another important set of new ideas for designing the bugs out of programs derives largely from Dijkstra,[12] and is built on a theoretical structure by Bohm and Jacopini.[13]

    The important point, and the one vital to constructing bug-free programs, is that one wants to think about the control structures of a system as control structures, not as individual branch statements. This way of thinking is a major step forward.
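
A small illustration of that way of thinking (my own example, not from the book): the first version scatters the decision across a flag and individual branches; the second states the same logic as one loop and one conditional, so the control structure can be read as a unit.

    # Branch-by-branch style: the exit condition is encoded in a flag and
    # scattered tests, so the reader must reconstruct the control flow.
    def first_negative_flagged(values):
        i, found, done = 0, -1, False
        while not done:
            if i >= len(values):
                done = True
            else:
                if values[i] < 0:
                    found, done = i, True
                i += 1
        return found

    # Structured style: the control structure itself states the intent.
    def first_negative_structured(values):
        for i, v in enumerate(values):
            if v < 0:
                return i
        return -1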

Component Debugging

  • On-machine debugging.
  • Memory dumps.
  • Snapshots.

    So people developed techniques for selective dumping, selective tracing, and for inserting snapshots into programs.

  • Interactive debugging.

System Debugging

The unexpectedly hard part of building a programming system is system test. One should be convinced of two things: system debugging will take longer than one expects, and its difficulty justifies a thoroughly systematic and planned approach. Let us now see what such an approach involves.[14]

  • Use debugged components.
  • Build plenty of scaffolding.

    By scaffolding I mean all programs and data built for debugging purposes but never intended to be in the final product.

    One form of scaffolding is the dummy component, which consists only of interfaces and perhaps some faked data or some small test cases; a minimal sketch of this and of the miniature file appears after this list.

    Another form is the miniature file. A very common form of system bug is misunderstanding of formats for tape and disk files. So it is worthwhile to build some little files that have only a few typical records, but all the descriptions, pointers, etc.

    Yet another form of scaffolding are auxiliary programs. Generators for test data, special analysis printouts, cross-reference table analyzers, are all examples of the special-purpose jigs and fixtures one may want to build.[15]

  • Control changes.

    Tight control during test is one of the impressive techniques of hardware debugging, and it applies as well to software systems.

    First, somebody must be in charge. He and he alone must authorize component changes or substitution of one version for another.

    Then, as discussed above, there must be controlled copies of the system: one locked-up copy of the latest versions, used for component testing; one copy under test, with fixes being installed; playpen copies where each man can work away on his component, doing both fixes and extensions.

    Programming needs a purple-wire technique, and it badly needs tight control and deep respect for the paper that ultimately is the product. The vital ingredients of such technique are the logging of all changes in a journal and the distinction, carried conspicuously in source code, between quick patches and thought-through, tested, documented fixes.

  • Add one component at a time.

    Note that one must have thorough test cases, testing the partial systems after each new piece is added. And the old ones, run successfully on the last partial sum, must be rerun on the new one to test for system regression.

  • Quantize updates.

    The replacement of a working component by a new version requires the same systematic testing procedure that adding a new component does, although it should require less time, for more complete and efficient test cases will usually be available.

    But the changes need to be quantized. Then each user has periods of productive stability, interrupted by bursts of test-bed change. This seems to be much less disruptive than a constant rippling and trembling.
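
Returning to the scaffolding forms above, here is a minimal sketch of a dummy component and a miniature file (module, field, and record names are invented for the illustration):

    # Dummy component: only the interface, returning canned data, so callers
    # can be integrated and tested before the real component exists.
    def lookup_customer(customer_id):
        """Stand-in for the real customer-master lookup."""
        canned = {1: {"name": "TEST CUSTOMER", "balance_cents": 0}}
        return canned.get(customer_id)

    # Miniature file: only a few typical records, but with the complete
    # record description, so format misunderstandings surface early.
    MINIATURE_CUSTOMER_FILE = [
        {"customer_id": 1, "name": "TEST CUSTOMER", "balance_cents": 0},
        {"customer_id": 2, "name": "EDGE CASE", "balance_cents": -125},
    ]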

Hatching a Catastrophe

Milestones or Millstones?

For picking the milestones there is only one relevant rule. Milestones must be concrete, specific, measurable events, defined with knife-edge sharpness.

"The Other Piece Is Late, Anyway"

The PERT technique, strictly speaking, is an elaboration of critical-path scheduling in which one estimates three times for every event, times corresponding to different probabilities of meeting the estimated dates.

Under the Rug

Two rug-lifting techniques are open to the boss. Both must be used. The first is to reduce the role conflict and inspire sharing of status. The other is to yank the rug back.

  • Reducing the role conflict.

    The boss must first distinguish between action information and status information. He must discipline himself not to act on problems his managers can solve, and never to act on problems when he is explicitly reviewing status.

    This whole process is helped if the boss labels meetings, reviews, conferences, as status-review meetings versus problem-action meetings, and controls himself accordingly.

  • Yanking the rug off.

    Nevertheless, it is necessary to have review techniques by which the true status is made known, whether cooperatively or not. The PERT chart with its frequent sharp milestones is the basis for such review.

    A report showing milestones and actual completions is the key document.

    Everyone knows the questions, and the component manager should be prepared to explain why it's late, when it will be finished, what steps he's taking, and what help, if any, he needs from the boss or collateral groups.

The Other Face

What Documentation Is Required?

To use a program. Every user needs a prose description of the program. Most documentation fails in giving too little overview. The trees are described, the bark and leaves are commented, but there is no map of the forest. To write a useful prose description, stand way back and come in slowly:

  1. Purpose. What is the main function, the reason for the program?
  2. Environment. On what machines, hardware configurations, and operating system configurations will it run?
  3. Domain and range. What domain of input is valid? What range of output can legitimately appear?
  4. Functions realized and algorithms used. Precisely what does it do?
  5. Input-output formats, precise and complete.
  6. Operating instructions, including normal and abnormal ending behavior, as seen at the console and on the outputs.
  7. Options. What choices does the user have about functions? Exactly how are those choices specified?
  8. Running time. How long does it take to do a problem of specified size on a specified configuration?
  9. Accuracy and checking. How precise are the answers expected to be? What means of checking accuracy are incorporated?
  • To believe a program.

    Then one needs more thorough test cases, which are normally run only after a program is modified. These fall into three parts of the input data domain:

    1. Mainline cases that test the program's chief functions for commonly encountered data.
    2. Barely legitimate cases that probe the edge of the input data domain, ensuring that largest possible values, smallest possible values, and all kinds of valid exceptions work.
    3. Barely illegitimate cases that probe the domain boundary from the other side, ensuring that invalid inputs raise proper diagnostic messages.
  • To modify a program.

    For the modifier, as well as the more casual user, the crying need is for a clear, sharp overview, this time of the internal structure. What are the components of such an overview?

    1. A flow chart or subprogram structure graph. More on this later.
    2. Complete descriptions of the algorithms used, or else references to such descriptions in the literature.
    3. An explanation of the layout of all files used.
    4. An overview of the pass structure—the sequence in which data or programs are brought from tape or disk—and what is accomplished on each pass.
    5. A discussion of modifications contemplated in the original design, the nature and location of hooks and exits, and discursive discussion of the ideas of the original author about what modifications might be desirable and how one might proceed. His observations on hidden pitfalls are also useful.

The Flow-Chart Curse

Flow charts show the decision structure of a program, which is only one aspect of its structure. They show decision structure rather elegantly when the flow chart is on one page, but the overview breaks down badly when one has multiple pages, sewed together with numbered exits and connectors.

Self-Documenting Programs

The solution, I think, is to merge the files, to incorporate the documentation in the source program. This is at once a powerful incentive toward proper maintenance, and an insurance that the documentation will always be handy to the program user. Such programs are called self-documenting.

An approach. The first notion is to use the parts of the program that have to be there anyway, for programming language reasons, to carry as much of the documentation as possible. So labels, declaration statements, and symbolic names are all harnessed to the task of conveying as much meaning as possible to the reader.

A second notion is to use space and format as much as possible to improve readability and show subordination and nesting.

The third notion is to insert the necessary prose documentation into the program as paragraphs of comment. Most programs tend to have enough line-by-line comments; those programs produced to meet stiff organizational standards for "good documentation" often have too many.
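
A minimal sketch of those three notions applied together (the routine is an invented example): the names and layout carry most of the meaning, and the prose overview rides in the program as a comment.

    # Overview: compute the tax owed on a gross amount, in integer cents,
    # rounding half a cent upward; the rate is a whole-number percentage.
    def tax_owed_cents(gross_cents, tax_rate_percent):
        """Tax owed on gross_cents at tax_rate_percent, rounded half up."""
        numerator = gross_cents * tax_rate_percent
        return (numerator + 50) // 100

    assert tax_owed_cents(gross_cents=1000, tax_rate_percent=7) == 70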

Why not? What are the drawbacks of such an approach to documentation? There are several, which have been real but are becoming imaginary with changing times.

The most serious objection is the increase in the size of the source code that must be stored.

Yet simultaneously we are moving also toward on-line storage of prose documents for access and for updating via computerized text-editing. As shown above, amalgamating prose and program reduces the total number of characters to be stored.

No Silver Bullet—Essence and Accident in Software Engineering

Abstract[16]

Therefore it appears that the time has come to address the essential parts of the software task, those concerned with fashioning abstract conceptual structures of great complexity. I suggest:

  • Exploiting the mass market to avoid constructing what can be bought.
  • Using rapid prototyping as part of a planned iteration in establishing software requirements.
  • Growing software organically, adding more and more function to systems as they are run, used, and tested.
  • Identifying and developing the great conceptual designers of the rising generation.

Introduction

There is no single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity.

The first step toward the management of disease was replacement of demon theories and humours theories by the germ theory. That very step, the beginning of hope, in itself dashed all hopes of magical solutions. It told workers that progress would be made stepwise, at great effort, and that a persistent, unremitting care would have to be paid to a discipline of cleanliness. So it is with software engineering today.

Does It Have to Be Hard?—Essential Difficulties

First, we must observe that the anomaly is not that software progress is so slow but that computer hardware progress is so fast. No other technology since civilization began has seen six orders of magnitude price-performance gain in 30 years. In no other technology can one choose to take the gain in either improved performance or in reduced costs. These gains flow from the transformation of computer manufacture from an assembly industry into a process industry.

Second, to see what rate of progress we can expect in software technology, let us examine its difficulties. Following Aristotle, I divide them into essence—the difficulties inherent in the nature of the software—and accidents—those difficulties that today attend its production but that are not inherent.

The essence of a software entity is a construct of interlocking concepts: data sets, relationships among data items, algorithms, and invocations of functions. This essence is abstract, in that the conceptual construct is the same under many different representations. It is nonetheless highly precise and richly detailed.

I believe the hard part of building software to be the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation. We still make syntax errors, to be sure; but they are fuzz compared to the conceptual errors in most systems.

If this is true, building software will always be hard. There is inherently no silver bullet.

Let us consider the inherent properties of this irreducible essence of modern software systems: complexity, conformity, changeability, and invisibility.

  • Complexity.

    Software entities are more complex for their size than perhaps any other human construct, because no two parts are alike (at least above the statement level). If they are, we make the two similar parts into one, a subroutine, open or closed.

    Likewise, a scaling-up of a software entity is not merely a repetition of the same elements in larger size; it is necessarily an increase in the number of different elements. In most cases, the elements interact with each other in some nonlinear fashion, and the complexity of the whole increases much more than linearly.

    Many of the classical problems of developing software products derive from this essential complexity and its nonlinear increases with size. From the complexity comes the difficulty of communication among team members, which leads to product flaws, cost overruns, schedule delays. From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability. From the complexity of the functions comes the difficulty of invoking those functions, which makes programs hard to use. From complexity of structure comes the difficulty of extending programs to new functions without creating side effects. From complexity of structure comes the unvisualized states that constitute security trapdoors.

    Not only technical problems but management problems as well come from the complexity. This complexity makes overview hard, thus impeding conceptual integrity. It makes it hard to find and control all the loose ends. It creates the tremendous learning and understanding burden that makes personnel turnover a disaster.

  • Conformity.

    Einstein repeatedly argued that there must be simplified explanations of nature, because God is not capricious or arbitrary.

    No such faith comforts the software engineer. Much of the complexity he must master is arbitrary complexity, forced without rhyme or reason by the many human institutions and systems to which his interfaces must conform. These differ from interface to interface, and from time to time, not because of necessity but only because they were designed by different people, rather than by God.

    In many cases the software must conform because it has most recently come to the scene. In others it must conform because it is perceived as the most conformable. But in all cases, much complexity comes from conformation to other interfaces; this cannot be simplified out by any redesign of the software alone.

  • Changeability.

    The software entity is constantly subject to pressures for change.

    Partly this is because the software in a system embodies its function, and the function is the part that most feels the pressures of change. Partly it is because software can be changed more easily—it is pure thought-stuff, infinitely malleable. Buildings do in fact get changed, but the high costs of change, understood by all, serve to dampen the whims of the changers.

  • Invisibility.

    Software is invisible and unvisualizable.

    As soon as we attempt to diagram software structure, we find it to constitute not one, but several, general directed graphs, superimposed one upon another. The several graphs may represent the flow of control, the flow of data, patterns of dependency, time sequence, name-space relationships. These are usually not even planar, much less hierarchical. Indeed, one of the ways of establishing conceptual control over such structure is to enforce link cutting until one or more of the graphs becomes hierarchical.

    In spite of progress in restricting and simplifying the structures of software, they remain inherently unvisualizable, thus depriving the mind of some of its most powerful conceptual tools. This lack not only impedes the process of design within one mind, it severely hinders communication among minds.

Past Breakthroughs Solved Accidental Difficulties

Footnotes:

1

Backus, J. W., "The syntax and semantics of the proposed international algebraic language," Proc. Intl. Conf. Inf. Proc., UNESCO, Paris, 1959, published by R. Oldenbourg, Munich, and Butterworth, London. Besides this, a whole collection of papers on the subject is contained in T. B. Steel, Jr. (ed.), Formal Language Description Languages for Computer Programming. Amsterdam: North Holland, 1966.

2

Backus, J. W., "The syntax and semantics of the proposed international algebraic language," Proc. Intl. Conf. Inf. Proc., UNESCO, Paris, 1959, published by R. Oldenbourg, Munich, and Butterworth, London. Besides this, a whole collection of papers on the subject is contained in T. B. Steel, Jr. (ed.), Formal Language Description Languages for Computer Programming. Amsterdam: North Holland, 1966.

3

Iverson, K. E., A Programming Language. New York: Wiley, 1962, Chapter 2.

4

Iverson, K. E., A Programming Language. New York: Wiley, 1962, Chapter 2.

5

Bell, C. G., and A. Newell, Computer Structures: Readings and Examples. New York: McGraw-Hill, 1971.

6

Bell, C. G., private communication.

7

Nanus, B., and L. Farr, "Some cost contributors to large-scale programs," AFIPS Proc. SJCC, 25 (Spring, 1964), pp. 239-248.

8

Morin, L. H., "Estimation of resources for computer programming projects," M. S. thesis, Univ. of North Carolina, Chapel Hill, 1974.

9

The matter of design change is complex, and I oversimplify here. See J. H. Saltzer, "Evolutionary design of complex systems," in D. Eckman (ed.), Systems: Research and Design. New York: Wiley, 1961. When all is said and done, however, I still advocate building a pilot system whose discarding is planned.

10

See also J. W. Pomeroy, "A guide to programming tools and techniques," IBM Sys. J., 11, 3 (1972), pp. 234-254.

11

Wirth, N., "Program development by stepwise refinement," CACM, 14, 4 (April, 1971), pp. 221-227. See also Mills, H., "Top-down programming in large systems," in R. Rustin (ed.), Debugging Techniques in Large Systems. Englewood Cliffs, N.J.: Prentice-Hall, 1971, pp. 41-55, and Baker, F. T., "System quality through structured programming," AFIPS Proc. FJCC, 41-1 (1972), pp. 339-343.

12

Dahl, O. J., E. W. Dijkstra, and C. A. R. Hoare, Structured Programming. London and New York: Academic Press, 1972. This volume contains the fullest treatment. See also Dijkstra's germinal letter, "Go to statement considered harmful," CACM, 11, 3 (March, 1968), pp. 147-148.

13

Bohm, C., and A. Jacopini, "Flow diagrams, Turing machines, and languages with only two formation rules," CACM, 9, 5 (May, 1966), pp. 366-371.

14

A good treatment of development of specifications and of system build and test is given by F. M. Trapnell, "A systematic approach to the development of system programs," AFIPS Proc. SJCC, 34 (1969), pp. 411-418.

15

A real-time system will require an environment simulator. See, for example, M. G. Ginzberg, "Notes on testing real-time system programs," IBM Sys. J., 4, 1 (1965), pp. 58-72.

16

The essay entitled "No Silver Bullet" is from Information Processing 1986, the Proceedings of the IFIP Tenth World Computing Conference, edited by H.-J. Kugler (1986), pp. 1069-76. Reprinted with the kind permission of IFIP and Elsevier Science B. V., Amsterdam, The Netherlands.

Author: Shi Shougang

Created: 2015-03-05 Thu 23:21
